Abstract

Let $\mathbb{F}_q$ be a finite field with $q$ elements, where $q$ is a prime power, and let
$r > 1$ be an integer with $q \equiv 1 \pmod{r}$. In this paper, we present a refinement of the
Cipolla-Lehmer type algorithm given by H. C. Williams, and subsequently improved by K. S.
Williams and K. Hardy. For a given r-th power residue $c \in \mathbb{F}_q$ where $r$ is an odd
prime, the algorithm of H. C. Williams determines a solution of $X^r = c$ in $O(r^3 \log q)$
multiplications in $\mathbb{F}_q$, and the algorithm of K. S. Williams and K. Hardy finds a
solution in $O(r^4 + r^2 \log q)$ multiplications in $\mathbb{F}_q$. Our refinement finds a
solution in $O(r^3 + r^2 \log q)$ multiplications in $\mathbb{F}_q$. Therefore our new method is
better than the previously proposed algorithms independent of the size of $r$, and the
implementation result via SAGE shows a substantial speed-up compared with the existing
algorithms.

Keywords: finite field, r-th root, Cipolla-Lehmer algorithm, Adleman-Manders-Miller

1 Introduction

Let $r > 1$ be an integer and $q$ be a power of a prime. Finding an r-th root (or finding a
root of $X^r = c$) in a finite field $\mathbb{F}_q$ has many applications in computational
number theory and in many other related topics. Examples include point halving and point
compression on elliptic curves [?], where square root computations are needed. Similar
applications for high genus curves also require r-th root computations.

Among several available root extraction methods for the equation $X^r - c = 0$, there are
two well known algorithms applicable for arbitrary integer $r > 1$: the Adleman-Manders-Miller
algorithm [1], a straightforward generalization of the Tonelli-Shanks square root algorithm
[?, 18] to the case of r-th root extraction, and the Cipolla-Lehmer algorithms [?]. Due
to the cumbersome extension field arithmetic needed for the Cipolla-Lehmer algorithm, one
usually prefers the Tonelli-Shanks or the Adleman-Manders-Miller algorithm, and other related
research [?] exists to improve the Tonelli-Shanks algorithm.

The efficiency of the Adleman-Manders-Miller algorithm heavily depends on the exponent
$\nu$ of $r$ satisfying $r^{\nu} \mid q - 1$ and $r^{\nu+1} \nmid q - 1$, and the algorithm
becomes quite slow if $\nu \approx \log q$. Even in the case of $r = 2$, it had been observed
in [?] that, for a prime $p = 9 \times 2^{3354} + 1$, running the Tonelli-Shanks algorithm
using various software such as Magma, Mathematica and Maple cost roughly 5 minutes, 45 minutes
and 390 minutes, respectively, while the Cipolla-Lehmer algorithm costs under 1 minute in any
of the above software. It should be mentioned that such extreme cases (of $p$ with $p - 1$
divisible by high powers of 2) may happen in some cryptographic applications. For example,
one of the NIST suggested curves [?], P-224: $y^2 = x^3 - 3x + b$ over $\mathbb{F}_p$, uses
a prime $p = 2^{224} - 2^{96} + 1$.
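To make the role of the exponent $\nu$ concrete, the following minimal Python sketch computes it for the two primes mentioned above. The helper name `valuation` is ours and purely illustrative.

```python
def valuation(n: int, r: int) -> int:
    """Return the largest v such that r**v divides n (assumes n != 0 and r > 1)."""
    v = 0
    while n % r == 0:
        n //= r
        v += 1
    return v

# NIST P-224 prime: p - 1 = 2**96 * (2**128 - 1), so the 2-adic exponent is 96.
p224 = 2**224 - 2**96 + 1
print(valuation(p224 - 1, 2))      # 96

# For p = 9 * 2**3354 + 1 we have p - 1 = 9 * 2**3354.
print(valuation(9 * 2**3354, 2))   # 3354
```

In both cases $\nu$ is far from small, which is the regime where Tonelli-Shanks type methods slow down.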

A generalization to r-th root extraction of the Cipolla-Lehmer square root algorithm was
proposed by H. C. Williams [19], and the complexity of the proposed algorithm is
$O(r^3 \log q)$ multiplications in $\mathbb{F}_q$. A refinement of the algorithm in [19] was
given by K. S. Williams and K. Hardy [20], where the complexity is reduced to
$O(r^4 + r^2 \log q)$ multiplications in $\mathbb{F}_q$. For the case of the square root, a new
Cipolla-Lehmer type algorithm based on the Lucas sequence was given by Müller [?]. A similar
result for the case $r = 3$ was also obtained by Cho et al. [5], and a possible generalization
to the r-th root extraction of Müller's square root algorithm was given in [6].

In this paper, we present a new Cipolla-Lehmer type algorithm for r-th root extractions
in $\mathbb{F}_q$ whose complexity is $O(r^3 + r^2 \log q)$ multiplications in $\mathbb{F}_q$,
which improves previously proposed results in [19, 20]. We also compare our algorithm with
those in [19, 20] using the software SAGE, and show that our algorithm performs consistently
better than those in [19, 20], as is expected from the theoretical complexity estimation. In
[19] and [20], only the case where $r$ is an odd prime was considered, but we will give the
general arguments (i.e., no restriction on $r$) here.

The remainder of this paper is organized as follows: In Section 2, we briefly summarize the
Cipolla-Lehmer algorithm, and introduce the works of H. C. Williams [19] and K. S. Williams
and K. Hardy [20]. In Section 3, we present our refinement of the Cipolla-Lehmer algorithm.
In Section 4, we give the complexity analysis of our algorithm and show the result of SAGE
implementations of the three algorithms (in [19], [20], and ours). Finally, in Section 5, we
give the concluding remarks.


2 Cipolla-Lehmer Algorithm in $\mathbb{F}_q$

Let $q$ be a prime power and $\mathbb{F}_q$ be a finite field with $q$ elements. Let
$c \neq 0 \in \mathbb{F}_q$ be an r-th power residue in $\mathbb{F}_q$ for an integer $r > 1$
with $q \equiv 1 \pmod{r}$. We restrict $r$ to be an odd prime in this section.


2.1 H. C. Williams’ algorithm 

Let $b \in \mathbb{F}_q$ be an element such that $b^r - c$ is not an r-th power residue in
$\mathbb{F}_q$. Such $b$ can be found after $r$ random trials of $b$. (See pp. 479-480 in [20]
for further explanation.) Then the polynomial $X^r - (b^r - c)$ is irreducible over
$\mathbb{F}_q$ and there exists $\theta \in \mathbb{F}_{q^r} - \mathbb{F}_q$ such that
$\theta^r = b^r - c$. Let $\omega = \theta^{q-1} = (b^r - c)^{\frac{q-1}{r}}$. Then we have
$\omega^r = 1$, where $\omega$ is a primitive r-th root because $b^r - c$ is not an r-th power
in $\mathbb{F}_q$.

For all $0 \le i \le r - 1$, using $q \equiv 1 \pmod{r}$, one has
$\theta^{q^i} = \theta \cdot \theta^{q^i - 1} = \theta \cdot (\theta^{q-1})^{1 + q + \cdots + q^{i-1}} = \theta \omega^i$,
which implies $(b - \theta)^{q^i} = b - \theta^{q^i} = b - \omega^i \theta$. Letting
$\alpha = b - \theta$, one has

$$\alpha^{\sum_{j=0}^{r-1} q^j} = (b - \theta)^{1 + q + q^2 + \cdots + q^{r-1}} = \prod_{i=0}^{r-1} (b - \omega^i \theta) = b^r - \theta^r = c. \tag{1}$$

Thus one may find an r-th root of $c$ by computing
$\alpha^{\frac{\sum_{j=0}^{r-1} q^j}{r}} \in \mathbb{F}_q[\theta] = \mathbb{F}_q[X]/(X^r - (b^r - c))$.
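The norm identity $\prod_{i=0}^{r-1} (b - \omega^i \theta) = b^r - \theta^r = c$ can be checked numerically on a toy case. The sketch below is our own illustration, restricted to a prime field: it takes $p = 13$, $r = 3$, $c = 8$ and $b = 1$ (so $b^r - c = 6$, which is not a cube modulo 13) and multiplies the linear factors in $\mathbb{F}_p[\theta]/(\theta^r - (b^r - c))$. The helper `mul` is a hypothetical name, not taken from [19] or [20].

```python
def mul(u, v, t, p):
    """Multiply u and v in F_p[theta]/(theta^r - t); both are length-r coefficient lists."""
    r = len(u)
    out = [0] * r
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            if i + j < r:
                out[i + j] = (out[i + j] + ui * vj) % p
            else:  # theta^(i+j) = t * theta^(i+j-r)
                out[i + j - r] = (out[i + j - r] + t * ui * vj) % p
    return out

p, r, c, b = 13, 3, 8, 1
t = (pow(b, r, p) - c) % p            # t = b^r - c = 6
omega = pow(t, (p - 1) // r, p)       # omega = theta^(p-1) = 9, a primitive cube root of unity

prod = [1] + [0] * (r - 1)
for i in range(r):
    factor = [b, (-pow(omega, i, p)) % p] + [0] * (r - 2)   # b - omega^i * theta
    prod = mul(prod, factor, t, p)

print(prod)   # [8, 0, 0], i.e. the constant c
```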

Proposition 1. [H. C. Williams]

Suppose that $c \neq 0$ is an r-th power in $\mathbb{F}_q$. Let $\theta^r = b^r - c$ with
$\theta \in \mathbb{F}_{q^r}$ and $b \in \mathbb{F}_q$ such that $b^r - c$ is not an r-th power
in $\mathbb{F}_q$. Then letting $\alpha = b - \theta$,

$$\alpha^{\frac{\sum_{j=0}^{r-1} q^j}{r}}$$

is an r-th root of $c$.


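The usual 'square and multiply method' (or 'double and add method' if one uses a linear
recurrence relation) requires roughly $\log \frac{\sum_{j=0}^{r-1} q^j}{r} \approx r \log q$
steps for the evaluation of $\alpha^{\frac{\sum_{j=0}^{r-1} q^j}{r}}$. Each step is a
multiplication in $\mathbb{F}_q[\theta]$, which costs $O(r^2)$ multiplications in
$\mathbb{F}_q$, and therefore the complexity of the algorithm of H. C. Williams is
$O(r^3 \log q)$ multiplications in $\mathbb{F}_q$. H. C. Williams' result can be expressed as
Algorithm 1 using the recurrence relation technique of Section 2.2.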


Algorithm 1 H. C. Williams' r-th root algorithm [19]

Input: An r-th power residue $c$ in $\mathbb{F}_q$
Output: $x \in \mathbb{F}_q$ satisfying $x^r = c$

1: do Choose a random $b \in \mathbb{F}_q$ until $b^r - c$ is not an r-th power residue.
2: $M \leftarrow \frac{1 + q + \cdots + q^{r-1}}{r}$
3: $A \leftarrow (b, -1, 0, \ldots, 0)$   // $A$ is a coefficient vector of $\alpha = b - \theta$. //
4: $A \leftarrow$ RecurrenceRelation($A$, $M$)   // $A$ is a coefficient vector of $\alpha^M$. //
5: $x \leftarrow$ corresponding element of $A$   // $x = \alpha^M$ //
6: return $x$
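For readers who want to experiment, the following is a minimal Python sketch of Algorithm 1 over a prime field $\mathbb{F}_p$. It is an illustration under our own assumptions, not the SAGE code used for the timings: the function names `mul`, `power` and `williams_root` are hypothetical, plain square-and-multiply stands in for RecurrenceRelation, and $p$ is assumed prime with $r$ an odd prime dividing $p - 1$.

```python
import random

def mul(u, v, t, p):
    """Product in F_p[theta]/(theta^r - t); u and v are length-r coefficient lists."""
    r = len(u)
    out = [0] * r
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            if i + j < r:
                out[i + j] = (out[i + j] + ui * vj) % p
            else:  # theta^(i+j) = t * theta^(i+j-r)
                out[i + j - r] = (out[i + j - r] + t * ui * vj) % p
    return out

def power(u, e, t, p):
    """Square-and-multiply exponentiation in F_p[theta]/(theta^r - t), e >= 0."""
    result = [1] + [0] * (len(u) - 1)
    while e > 0:
        if e & 1:
            result = mul(result, u, t, p)
        u = mul(u, u, t, p)
        e >>= 1
    return result

def williams_root(c, r, p):
    """Return x with x^r = c in F_p, for c a nonzero r-th power residue."""
    while True:  # step 1: find b such that b^r - c is not an r-th power
        b = random.randrange(p)
        t = (pow(b, r, p) - c) % p
        if t != 0 and pow(t, (p - 1) // r, p) != 1:
            break
    M = sum(p**j for j in range(r)) // r        # step 2
    alpha = [b, p - 1] + [0] * (r - 2)          # step 3: alpha = b - theta
    x = power(alpha, M, t, p)                   # step 4
    assert not any(x[1:]), "alpha^M should lie in F_p"
    return x[0]                                 # steps 5-6

x = williams_root(8, 3, 13)
print(x, pow(x, 3, 13))
```

Which of the $r$ roots is returned depends on the random choice of $b$; only the relation $x^r = c$ is guaranteed.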


Note that $\alpha = b + \theta$ is used in the original paper [19], while our presentation is
based on [20], which uses $\alpha = b - \theta$. We followed [20] because it is more convenient
for dealing with general $r$, which is not necessarily an odd prime. For example, if one uses
$\alpha = b + \theta$ as in [19], then the case of even $r$ (such as $r = 2$) cannot be
covered. Detailed explanations will be given in Section 3.


2.2 Recurrence relation 

Given $\alpha = \sum_{i=0}^{r-1} a_i \theta^i \in \mathbb{F}_q[\theta]$, define
$a_i(j) \in \mathbb{F}_q$ ($0 \le i \le r - 1$, $1 \le j$) as

$$\alpha^j = \left( \sum_{i=0}^{r-1} a_i \theta^i \right)^j = \sum_{i=0}^{r-1} a_i(j) \theta^i. \tag{2}$$

In particular, one has $a_i(1) = a_i$ for all $0 \le i \le r - 1$. Then one has

$$\sum_{l=0}^{r-1} a_l(m+n) \theta^l = \left( \sum_{i=0}^{r-1} a_i(m) \theta^i \right) \left( \sum_{j=0}^{r-1} a_j(n) \theta^j \right) = \sum_{l=0}^{r-1} \left( \sum_{j=0}^{l} a_j(m) a_{l-j}(n) \right) \theta^l + (b^r - c) \sum_{l=0}^{r-2} \left( \sum_{j=l+1}^{r-1} a_j(m) a_{l+r-j}(n) \right) \theta^l,$$

which implies

$$a_l(m+n) = \sum_{j=0}^{l} a_j(m) a_{l-j}(n) + (b^r - c) \sum_{j=l+1}^{r-1} a_j(m) a_{l+r-j}(n) \tag{3}$$

for all $0 \le l \le r - 1$. When $l = r - 1$, the second summation in equation (3) is empty,
so that one has $a_{r-1}(m+n) = \sum_{j=0}^{r-1} a_j(m) a_{r-1-j}(n)$. This recurrence relation
is summarized in Algorithm 2.


Algorithm 2 RecurrenceRelation($A$, $M$)

Input: A coefficient vector $A = (a_0, a_1, \ldots, a_{r-1})$ of $\alpha = \sum_{i=0}^{r-1} a_i \theta^i \in \mathbb{F}_q[\theta]$ and $M \in \mathbb{Z}^+$
Output: A coefficient vector of $\alpha^M \in \mathbb{F}_q[\theta]$

1: Write $M = \sum_{k} M_k 2^k$ where $M_k \in \{0, 1\}$.
2: $(B_0, B_1, \ldots, B_{r-1}) \leftarrow (a_0, a_1, \ldots, a_{r-1})$
3: for $k$ from $\lfloor \log_2 M \rfloor - 1$ downto 0 do
4:     $(A_0, A_1, \ldots, A_{r-1}) \leftarrow (B_0, B_1, \ldots, B_{r-1})$
5:     for $i$ from 0 to $r - 1$ do
6:         $B_i \leftarrow \sum_{j=0}^{i} A_j A_{i-j} + (b^r - c) \sum_{j=i+1}^{r-1} A_j A_{r+i-j}$
7:     if $M_k = 1$ then
8:         $(A_0, A_1, \ldots, A_{r-1}) \leftarrow (B_0, B_1, \ldots, B_{r-1})$
9:         for $i$ from 0 to $r - 1$ do
10:            $B_i \leftarrow \sum_{j=0}^{i} A_j a_{i-j} + (b^r - c) \sum_{j=i+1}^{r-1} A_j a_{r+i-j}$
11: return $(B_0, \ldots, B_{r-1})$
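A direct Python transcription of Algorithm 2 for a prime field may help in reading the index ranges of the recurrence. This is a sketch under our own naming: `step` and `recurrence_relation` are hypothetical helpers, `t` stands for $b^r - c$, and `p` is assumed to be prime.

```python
def step(A, a, t, p):
    """Coefficients of (sum A_j theta^j) * (sum a_j theta^j) using the recurrence
    for a_l(m+n), where theta^r = t."""
    r = len(A)
    return [
        (sum(A[j] * a[i - j] for j in range(i + 1))
         + t * sum(A[j] * a[r + i - j] for j in range(i + 1, r))) % p
        for i in range(r)
    ]

def recurrence_relation(a, M, t, p):
    """Left-to-right binary exponentiation: coefficient vector of alpha^M, M >= 1."""
    B = list(a)
    for bit in bin(M)[3:]:          # bits of M after the leading 1
        B = step(B, B, t, p)        # squaring (steps 4-6)
        if bit == "1":
            B = step(B, a, t, p)    # multiplication by alpha (steps 8-10)
    return B

# Toy usage: p = 13, r = 3, c = 8, b = 1, so t = 6 and alpha = 1 - theta.
root = recurrence_relation([1, 12, 0], (1 + 13 + 13**2) // 3, 6, 13)
print(root, pow(root[0], 3, 13))
```

In the toy run the higher coefficients vanish and the constant term cubes to $c = 8$, as Proposition 1 predicts.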


2.3 An improvement of K. S. Williams and K. Hardy 


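Williams and Hardy [20] improved the algorithm of H. C. Williams by reducing the loop length
to $\log q$ as follows. Write $\alpha^{\frac{\sum_{j=0}^{r-1} q^j}{r}}$ (where
$\alpha = b - \theta$) as

$$\alpha^{\frac{\sum_{j=0}^{r-1} q^j}{r}} = E_1^{\frac{q-1}{r}} \cdot E_2, \tag{4}$$

where

$$E_1 = \alpha^{(q-1)^{r-2}}, \qquad E_2 = \alpha^{\frac{\sum_{j=0}^{r-1} q^j - (q-1)^{r-1}}{r}}.$$

By noticing that the exponent $\frac{\sum_{j=0}^{r-1} q^j - (q-1)^{r-1}}{r}$ of $E_2$ is a
polynomial of $q$ with integer coefficients and using the binomial theorem, one has the
following expression of $E_1$ and $E_2$ as

$$E_1 = \prod_{i=0}^{r-2} X_i \quad \text{with} \quad X_i = (b - \omega^i \theta)^{(-1)^{r-i} \binom{r-2}{i}}, \tag{5}$$

$$E_2 = \prod_{i=1}^{r-1} Y_i \quad \text{with} \quad Y_i = (b - \omega^i \theta)^{\frac{1 - (-1)^i \binom{r-1}{i}}{r}}. \tag{6}$$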

Thus we have the following result of Williams and Hardy. 

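Proposition 2. [Williams-Hardy]

(1) Under the same assumption as in Proposition 1, $E_1^{\frac{q-1}{r}} \cdot E_2$ is an r-th
root of $c$, where

$$E_1 = \alpha^{(q-1)^{r-2}}, \qquad E_2 = \alpha^{\frac{\sum_{j=0}^{r-1} q^j - (q-1)^{r-1}}{r}}.$$

(2) $E_1$ and $E_2$ can be efficiently computed using the relations

$$E_1 = \prod_{i=0}^{r-2} (b - \omega^i \theta)^{(-1)^{r-i} \binom{r-2}{i}}, \qquad E_2 = \prod_{i=1}^{r-1} (b - \omega^i \theta)^{\frac{1 - (-1)^i \binom{r-1}{i}}{r}}.$$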
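As a small worked instance of this decomposition (our own illustration, not taken from [20]), take $r = 3$. The exponents $(-1)^{r-i}\binom{r-2}{i}$ are $-1$ and $1$ for $i = 0, 1$, and the exponents $\frac{1 - (-1)^i \binom{r-1}{i}}{r}$ are $1$ and $0$ for $i = 1, 2$, which gives:

```latex
% r = 3: (q-1)^{r-2} = q - 1 and (1 + q + q^2) - (q-1)^2 = 3q
\frac{1 + q + q^2}{3} = \frac{q-1}{3}\,(q - 1) + q,
\qquad
E_1 = \alpha^{q-1} = (b - \theta)^{-1} (b - \omega\theta),
\qquad
E_2 = \alpha^{q} = b - \omega\theta .
```

So for $r = 3$ the only long exponentiation left is $E_1^{\frac{q-1}{3}}$, whose exponent has about $\log q$ bits.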


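Algorithm 3 Williams-Hardy r-th root algorithm [20]

Input: An r-th power residue $c$ in $\mathbb{F}_q$
Output: $x \in \mathbb{F}_q$ satisfying $x^r = c$

1: do Choose a random $b \in \mathbb{F}_q$ until $b^r - c$ is not an r-th power residue.
2: $\omega \leftarrow (b^r - c)^{\frac{q-1}{r}}$, where $\theta^r = b^r - c$.
3: $E_1 \leftarrow 1$, $E_2 \leftarrow 1$
4: for $i$ from 1 to $r - 1$ do
5:     $X_i \leftarrow (b - \omega^{i-1} \theta)^{(-1)^{r-i+1} \binom{r-2}{i-1}}$, $Y_i \leftarrow (b - \omega^{i} \theta)^{\frac{1 - (-1)^i \binom{r-1}{i}}{r}}$
6:     $E_1 \leftarrow E_1 \cdot X_i$, $E_2 \leftarrow E_2 \cdot Y_i$
7: $A \leftarrow$ coefficient vector of $E_1$
8: $A \leftarrow$ RecurrenceRelation($A$, $\frac{q-1}{r}$)
9: $E_1' \leftarrow$ corresponding element of $A$ in $\mathbb{F}_q[\theta]$
10: $x \leftarrow E_1' \cdot E_2$
11: return $x$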


The complexity of computing each $X_i$ in equation (5) is
$O(\log q) + O(r) + O(r^2 \log \binom{r-2}{i})$ multiplications in $\mathbb{F}_q$. Hence all
$X_i$ can be computed in $O(r \log q + r^4)$ $\mathbb{F}_q$-multiplications. Since the $O(r)$
multiplications of all $X_i$ ($0 \le i \le r - 2$) in $\mathbb{F}_{q^r}$ need $O(r^3)$
multiplications in $\mathbb{F}_q$, the total complexity of computing $E_1$ (as a polynomial in
$\theta$ of degree at most $r - 1$) is $O(r \log q + r^4)$ $\mathbb{F}_q$-multiplications.
Similarly the complexity of computing $E_2$ is also $O(r \log q + r^4)$
$\mathbb{F}_q$-multiplications. For a detailed explanation, see [20]. Since the exponentiation
$E_1^{\frac{q-1}{r}}$ (using the recurrence relation) needs
$O(r^2 \log \frac{q-1}{r}) = O(r^2 \log q)$ multiplications in $\mathbb{F}_q$, and since the
multiplication of the two elements $E_1^{\frac{q-1}{r}}$ and $E_2$ needs $O(r)$ multiplications
in $\mathbb{F}_q$ (because only the constant term of the $\theta$ expansion is needed), the
total cost of computing an r-th root of $c$ using the algorithm of K. S. Williams and K. Hardy
[20] is $O(r^2 \log q + r^4)$.


3 Our New r-th Root Algorithm 


In this section, we give an improved version of the Cipolla-Lehmer type algorithm by
generalizing the method of [?]. Our new algorithm is applicable for all $r > 1$ with
$q \equiv 1 \pmod{r}$. Throughout this section, we assume that $r$ is not necessarily a prime.
Thus $\omega = \theta^{q-1} = (b^r - c)^{\frac{q-1}{r}}$ may not be a primitive r-th root of
unity even if $b^r - c$ is not an r-th power in $\mathbb{F}_q$. Consequently a stronger
condition is needed for the primitivity of $\omega$. That is, $\omega$ is a primitive r-th root
of unity if and only if $\omega^{\frac{r}{p}} \neq 1$ for every prime $p \mid r$, which holds
if and only if $(b^r - c)^{\frac{q-1}{p}} \neq 1$ for every prime $p \mid r$. From now on, we
will assume that $(b^r - c)^{\frac{q-1}{p}} \neq 1$ for every prime $p \mid r$ and therefore
$\omega$ is a primitive r-th root of unity.
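The primitivity test can be sketched in a few lines of Python for a prime field. The helper names `prime_factors` and `gives_primitive_root` are ours, trial division is only reasonable for small $r$, and the modulus `p` is assumed prime. The toy case $p = 13$, $r = 4$, $c = 3$ shows the phenomenon described above: for $b = 0$ the value $b^r - c = 10$ is not a fourth power, yet $\omega = 10^3 = 12$ only has order 2.

```python
def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def gives_primitive_root(b, c, r, p):
    """True if omega = (b^r - c)^((p-1)/r) is a primitive r-th root of unity in F_p."""
    t = (pow(b, r, p) - c) % p
    if t == 0:
        return False
    return all(pow(t, (p - 1) // ell, p) != 1 for ell in prime_factors(r))

print(pow(10, 3, 13))                       # 12, so 10 is not a fourth power mod 13
print(gives_primitive_root(0, 3, 4, 13))    # False: omega = 12 has order 2
print(gives_primitive_root(1, 3, 4, 13))    # True: omega = 11^3 = 5 has order 4
```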

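Let $\alpha \in \mathbb{F}_{q^r}$. Then, by extracting r-th roots from the following simple
identity

$$\alpha^r \left( 1 \cdot \alpha \cdot \alpha^{1+q} \cdots \alpha^{1+q+q^2+\cdots+q^{r-2}} \right)^{q} = \left( 1 \cdot \alpha \cdot \alpha^{1+q} \cdots \alpha^{1+q+q^2+\cdots+q^{r-2}} \right) \alpha^{1+q+\cdots+q^{r-1}},$$

one may expect that
$\alpha \left( 1 \cdot \alpha \cdot \alpha^{1+q} \cdots \alpha^{1+q+q^2+\cdots+q^{r-2}} \right)^{\frac{q-1}{r}}$
equals $\alpha^{\frac{1+q+\cdots+q^{r-1}}{r}}$ up to r-th roots of unity. In fact, they are
exactly the same element in $\mathbb{F}_q$, which can be verified as follows:

$$\alpha^{\frac{1+q+\cdots+q^{r-1}}{r}} = \alpha^{\frac{r + \sum_{i=1}^{r-1} (q^i - 1)}{r}} = \alpha \cdot \alpha^{\frac{\sum_{i=1}^{r-1} (q^i - 1)}{r}} \tag{7}$$

$$= \alpha \cdot \alpha^{\frac{q-1}{r} \sum_{i=1}^{r-1} \frac{q^i - 1}{q-1}} = \alpha \cdot \alpha^{\frac{q-1}{r} \sum_{i=1}^{r-1} \sum_{j=0}^{i-1} q^j} \tag{8}$$

$$= \alpha \cdot \left( \alpha^{\sum_{i=1}^{r-1} \sum_{j=0}^{i-1} q^j} \right)^{\frac{q-1}{r}} \tag{9}$$

$$= \alpha \cdot \left( 1 \cdot \alpha \cdot \alpha^{1+q} \cdots \alpha^{1+q+q^2+\cdots+q^{r-2}} \right)^{\frac{q-1}{r}}. \tag{10}$$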

Proposition 3. [Main Theorem]

Let $q \equiv 1 \pmod{r}$ with $r > 1$ and let $(b^r - c)^{\frac{q-1}{p}} \neq 1$ for all prime
divisors $p$ of $r$. Then letting $\alpha = b - \theta$ where $\theta^r = b^r - c$,

$$\alpha \cdot \left( 1 \cdot \alpha \cdot \alpha^{1+q} \cdots \alpha^{1+q+q^2+\cdots+q^{r-2}} \right)^{\frac{q-1}{r}}$$

is an r-th root of $c$.
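To see the exponent bookkeeping of Proposition 3 on the two smallest cases, one can expand the total exponent of $\alpha$ by hand. The following is our own worked check for $r = 2$ and $r = 3$:

```latex
% r = 2: the product inside the parentheses is just \alpha
\alpha \cdot \alpha^{\frac{q-1}{2}} = \alpha^{\frac{q+1}{2}} = \alpha^{\frac{1+q}{2}},
\\[4pt]
% r = 3: the product is \alpha \cdot \alpha^{1+q} = \alpha^{2+q}
\alpha \cdot \left( \alpha^{2+q} \right)^{\frac{q-1}{3}}
  = \alpha^{\frac{3 + (q+2)(q-1)}{3}}
  = \alpha^{\frac{1 + q + q^2}{3}} .
```

In both cases the result is $\alpha^{\frac{\sum_{j=0}^{r-1} q^j}{r}}$, the element of Proposition 1.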

Based on the above simple result, we may present a new r-th root algorithm (Algorithm 4)
of complexity $O(r^2 \log q + r^3)$, given the prime factors of $r$. It should be mentioned
that our proposed algorithm is general in the sense that $r$ can be any (composite) positive
integer $> 1$ satisfying $q \equiv 1 \pmod{r}$, while $r$ was assumed to be an odd prime both
in [19] and [20].

Both in [19] and [20], $b$ was chosen so that $\omega = (b^r - c)^{\frac{q-1}{r}} \neq 1$, and
since $r$ is prime, $\omega$ is automatically a primitive r-th root. This property guarantees
the validity of equation (1), namely

$$(b - \theta)(b - \omega\theta)(b - \omega^2\theta) \cdots (b - \omega^{r-1}\theta) = b^r - \theta^r = c. \tag{11}$$

However if $r$ is composite, then $\omega = (b^r - c)^{\frac{q-1}{r}}$ is not a primitive r-th
root in general. In fact, letting $s > 1$ be the least positive integer satisfying
$\omega^s = 1$, the degree of the irreducible polynomial of $\theta$ (where
$\theta^r = b^r - c$) is $s$ because

$$\theta^{q^s - 1} = (\theta^{q-1})^{q^{s-1} + q^{s-2} + \cdots + q + 1} = \omega^{q^{s-1} + q^{s-2} + \cdots + q + 1} = \omega^s = 1,$$

and one has

$$(b - \theta)(b - \omega\theta) \cdots (b - \omega^{r-1}\theta) = \left\{ (b - \theta)(b - \omega\theta) \cdots (b - \omega^{s-1}\theta) \right\}^{\frac{r}{s}} = (b^s - \theta^s)^{\frac{r}{s}} \neq c \tag{12}$$

if $s < r$. Therefore the methods of [19] and [20] do not work for a composite $r$ unless one
assumes the primitivity of $\omega$.

Also, even if one assumes the primitivity of $\omega = (b^r - c)^{\frac{q-1}{r}}$, one still
has some problems both in [19] and [20], which will be explained in the following remarks.

Remark 1. In [19], $\alpha = b + \theta$ was used (instead of $b - \theta$) under the
assumption of $\theta^r = c - b^r$ with $(c - b^r)^{\frac{q-1}{r}} \neq 1$. If we choose
$\alpha = b + \theta$ following [19], then we get

$$(b + \theta)(b + \omega\theta) \cdots (b + \omega^{r-1}\theta) = b^r - (-\theta)^r = b^r + (-1)^{r+1} \theta^r. \tag{13}$$

Therefore if $r$ is an odd prime (as was originally assumed in [19]), one has
$b^r + \theta^r = c$ and the r-th root algorithm is essentially the same as in the case
$\alpha = b - \theta$. However when $r$ is even (for example, when $r = 2$), the original
method in [19] cannot be used because $b^r + (-1)^{r+1} \theta^r = b^r - \theta^r \neq c$.
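Algorithm 4 Our new r-th root algorithm

Input: An r-th power residue $c$ in $\mathbb{F}_q$
Output: $x \in \mathbb{F}_q$ satisfying $x^r = c$

1: do Choose a random $b \in \mathbb{F}_q$ until $(b^r - c)^{\frac{q-1}{r}}$ is a primitive r-th root of unity.
2: $\omega \leftarrow (b^r - c)^{\frac{q-1}{r}}$, $\alpha \leftarrow b - \theta$ where $\theta^r = b^r - c$.
3: $P \leftarrow \alpha$, $A \leftarrow \alpha$, $W \leftarrow 1$
4: for $i = 1$ to $r - 2$ do   // $A, P \in \mathbb{F}_q[\theta]$ and $W \in \mathbb{F}_q$ //
5:     $W \leftarrow W\omega$, $V \leftarrow b - W\theta$   // $W = \omega^i$, $V = b - \omega^i\theta = \alpha^{q^i}$ //
6:     $A \leftarrow AV$, $P \leftarrow PA$   // $A = \alpha^{1+q+\cdots+q^i}$, $P = \alpha \cdot \alpha^{1+q} \cdots \alpha^{1+q+\cdots+q^i}$ //
7: $B \leftarrow$ coefficient vector of $P$
8: $B \leftarrow$ RecurrenceRelation($B$, $\frac{q-1}{r}$)
9: $P \leftarrow$ corresponding element of $B$ in $\mathbb{F}_q[\theta]$
10: $x \leftarrow \alpha \cdot P$   // $x \in \mathbb{F}_q$ //
11: return $x$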
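The steps of Algorithm 4 can be sketched in Python for a prime field $\mathbb{F}_p$ as follows. This is a minimal illustration under our own assumptions, not the SAGE implementation used in Section 4: the names `prime_factors`, `mul`, `power` and `new_root` are hypothetical, `p` must be prime with $p \equiv 1 \pmod{r}$, $c$ must be a nonzero r-th power residue, and generic square-and-multiply replaces RecurrenceRelation.

```python
import random

def prime_factors(n):
    """Distinct prime factors of n by trial division (adequate for small r)."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def mul(u, v, t, p):
    """Product in F_p[theta]/(theta^r - t) on length-r coefficient lists."""
    r = len(u)
    out = [0] * r
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            k = i + j
            if k < r:
                out[k] = (out[k] + ui * vj) % p
            else:  # theta^k = t * theta^(k-r)
                out[k - r] = (out[k - r] + t * ui * vj) % p
    return out

def power(u, e, t, p):
    """Square-and-multiply exponentiation in F_p[theta]/(theta^r - t)."""
    result = [1] + [0] * (len(u) - 1)
    while e > 0:
        if e & 1:
            result = mul(result, u, t, p)
        u = mul(u, u, t, p)
        e >>= 1
    return result

def new_root(c, r, p):
    """Return x with x^r = c in F_p; r > 1 may be composite."""
    primes = prime_factors(r)
    while True:                                   # step 1: omega must be primitive
        b = random.randrange(p)
        t = (pow(b, r, p) - c) % p
        if t and all(pow(t, (p - 1) // ell, p) != 1 for ell in primes):
            break
    omega = pow(t, (p - 1) // r, p)               # step 2
    alpha = [b, p - 1] + [0] * (r - 2)            # alpha = b - theta
    P, A, W = alpha[:], alpha[:], 1               # step 3
    for _ in range(1, r - 1):                     # steps 4-6, i = 1, ..., r-2
        W = W * omega % p
        V = [b, (-W) % p] + [0] * (r - 2)         # V = b - omega^i * theta
        A = mul(A, V, t, p)
        P = mul(P, A, t, p)
    P = power(P, (p - 1) // r, t, p)              # steps 7-9
    x = mul(alpha, P, t, p)                       # step 10
    assert not any(x[1:]), "result should lie in F_p"
    return x[0]

x = new_root(3, 4, 13)     # composite r = 4; note 2^4 = 3 (mod 13)
print(x, pow(x, 4, 13))
```

For $r = 2$ the loop body never runs and the function reduces to computing $\alpha \cdot \alpha^{\frac{p-1}{2}}$.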


Remark 2. The algorithm in [20] needs $E_1$ and $E_2$ satisfying
$\alpha^{\frac{\sum_{j=0}^{r-1} q^j}{r}} = E_1^{\frac{q-1}{r}} \cdot E_2$. However for
composite $r$, $E_2$ cannot be well-defined in some cases, since the exponent
$\frac{1 - (-1)^i \binom{r-1}{i}}{r}$ in equation (6) is not an integer in general. That is,
the property $(-1)^i \binom{r-1}{i} \equiv 1 \pmod{r}$ holds for all $1 \le i \le r - 1$ only
when $r$ is prime. Therefore the algorithm in [20] fails to give the answer when $r$ is
composite, such as $r = 4, 6, 9, \ldots$. (For example, when $r = 4$ and $i = 2$, the exponent
is $\frac{1 - \binom{3}{2}}{4} = -\frac{1}{2}$, so a coefficient of a power of $q$ in the
exponent of $E_2$ is not an integer and one cannot compute $E_2$.) The problem of $E_2$ being
undefined is unavoidable even if one assumes the primitivity of $\omega$.
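The integrality condition behind Remark 2 is easy to tabulate. The short Python sketch below is an illustration of ours, with the hypothetical helper `e2_exponents`; it lists the exponents $\frac{1 - (-1)^i \binom{r-1}{i}}{r}$ for a few prime and composite values of $r$ as exact fractions.

```python
from fractions import Fraction
from math import comb

def e2_exponents(r):
    """Exponents (1 - (-1)^i * C(r-1, i)) / r for i = 1, ..., r-1, as exact fractions."""
    return [Fraction(1 - (-1) ** i * comb(r - 1, i), r) for i in range(1, r)]

for r in (3, 5, 7, 4, 6, 9):
    exps = e2_exponents(r)
    integral = all(e.denominator == 1 for e in exps)
    print(r, "integral" if integral else "not integral", [str(e) for e in exps])
```

For the primes 3, 5 and 7 every exponent is an integer, while each of 4, 6 and 9 produces at least one proper fraction.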

4 Complexity Analysis and Comparison 

4.1 Complexity analysis 

An initial step of the proposed algorithm requires one to find a primitive r-th root $\omega$
in $\mathbb{F}_q$. When $r$ is prime, one only needs to find $b$ satisfying
$\omega = (b^r - c)^{\frac{q-1}{r}} \neq 0, 1$, and the probability that a random $b$ satisfies
the required property is $\frac{r-1}{r} + O(q^{-\frac{1}{4}})$ ([20], p. 480) under the
assumption of $r < q^{\frac{1}{4}}$. When $r$ is composite, one further needs to check whether
$\omega^{\frac{r}{p}} \neq 1$ for every prime divisor $p$ of $r$. Since the complexity
estimations $O(r^3 \log q)$ in [19] and $O(r^2 \log q + r^4)$ in [20] still hold if one assumes
that a primitive root $\omega = (b^r - c)^{\frac{q-1}{r}}$ is already given, we will also
assume that a primitive root $\omega$ is given in our estimation for a fair comparison.

At each $i$-th step of the for-loop of our proposed algorithm, step 5 needs 1
$\mathbb{F}_q$ multiplication. In step 6, the computation $AV$ needs 1 $\mathbb{F}_{q^r}$
multiplication which, in fact, can be executed with $2r$ $\mathbb{F}_q$ multiplications because
$V = b - \omega^i \theta$ is linear in $\theta$. The computation $PA$ needs 1
$\mathbb{F}_{q^r}$ multiplication which can be executed with $r^2$ $\mathbb{F}_q$
multiplications. Therefore, at the end of the for-loop, one needs at most
$(r - 2)(1 + 2r + r^2) < (r + 1)^3$ $\mathbb{F}_q$ multiplications (of order $O(r^3)$). Since
the exponentiation $P^{\frac{q-1}{r}}$ (in steps 7-9) needs $O(r^2 \log q)$ $\mathbb{F}_q$
multiplications, the total cost of our proposed algorithm is $O(r^3 + r^2 \log q)$
multiplications in $\mathbb{F}_q$. On the other hand, the cost of Algorithm 1 [19] is
$O(r^2 \log \frac{\sum_{j=0}^{r-1} q^j}{r}) = O(r^3 \log q)$, and the cost of Algorithm 3 [20]
is $O(r^4 + r^2 \log q)$, where $O(r^4)$ comes from the cost of computing $E_1$ and $E_2$ in
steps 4-6 of Algorithm 3. The theoretical estimation shows that the advantage of our proposed
algorithm over Algorithm 3 grows as $r$ gets larger.
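To get a rough feel for how the three bounds compare, one can evaluate their leading terms for sample parameters. The Python sketch below is only indicative: it sets every hidden constant in the $O$-notation to 1, which is our simplifying assumption, and uses $\log q = 2000$ to mimic a 2000-bit field. The function name `cost_estimates` is hypothetical.

```python
def cost_estimates(r, log_q):
    """Leading-order F_q multiplication counts, with all O-constants assumed to be 1."""
    return {
        "H. C. Williams [19]": r**3 * log_q,
        "Williams-Hardy [20]": r**4 + r**2 * log_q,
        "proposed": r**3 + r**2 * log_q,
    }

for r in (3, 43, 101, 211):
    print(r, cost_estimates(r, 2000))
```

These counts are not predictions of running time; they only show how the $r^3 \log q$ and $r^4$ terms start to dominate as $r$ grows.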

Finally, when $r = 2$, the for-loop can be omitted in our algorithm, so that one only needs
to compute $P \cdot P^{\frac{q-1}{2}}$ with $P = \alpha$, which is exactly the same as the
original Cipolla-Lehmer algorithm.

4.2 Implementation results 

Table 1 shows the implementation results using SAGE of the above mentioned two algorithms
and our proposed one. The implementation was performed on an Intel Core i7-4770 at 3.40 GHz
with 8 GB of memory.


Table 1: Running time (in seconds) for r-th root algorithms

r                         3        4        43          101        211
Algorithm 1 [19]          0.467    fail     2026.962    Interr.    Interr.
Algorithm 3 [20]          0.254    fail     53.849      535.043    3956.433
Our proposed algorithm    0.253    0.355    48.359      256.601    1098.401


For convenience, we used prime fields $\mathbb{F}_p$ with size about 2000 bits. Average
timings of the r-th root computations for 5 different inputs of r-th power residue
$c \in \mathbb{F}_p$ are computed for the primes $r = 3, 43, 101, 211$. As one can see in the
table, our proposed algorithm performs better than the algorithms in [19] and [20]. The table
also shows that the advantage of our algorithm over the other algorithms grows as $r$ gets
larger. For example, when $r = 101$, our algorithm is roughly 2 times faster than Algorithm 3,
and when $r = 211$, our algorithm is roughly 3.6 times faster than Algorithm 3. For
$r = 101, 211$, the SAGE computations were interrupted after 3 hours for Algorithm 1.
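The speed-up factors quoted above follow directly from the entries of Table 1. As a quick arithmetic check, the Python snippet below divides the Algorithm 3 timings by ours; the dictionary layout is our own and the numbers are copied from the table.

```python
# Timings in seconds, copied from Table 1.
timings = {
    43: {"Algorithm 3": 53.849, "proposed": 48.359},
    101: {"Algorithm 3": 535.043, "proposed": 256.601},
    211: {"Algorithm 3": 3956.433, "proposed": 1098.401},
}

speedups = {r: row["Algorithm 3"] / row["proposed"] for r, row in timings.items()}
for r, ratio in speedups.items():
    print(r, round(ratio, 2))
```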

5 Conclusions 

We proposed a new Cipolla-Lehmer type algorithm for r-th root extractions in $\mathbb{F}_q$.
Our algorithm has the complexity of $O(r^3 + r^2 \log q)$ multiplications in $\mathbb{F}_q$,
which improves the previous results of $O(r^3 \log q)$ in [19] and of $O(r^4 + r^2 \log q)$ in
[20]. Our algorithm is applicable for any integer $r > 1$, whereas the previous algorithms are
effective only for odd prime $r$. Software implementations via SAGE also show that our proposed
algorithm is consistently faster than the previously proposed algorithms, and becomes much more
effective as $r$ gets larger.